Transcriptomic profiling identifies a nucleotide metabolism-related signature with prognostic power in gliomas

Highlights • Nucleotide metabolism-related gene set could classify the glioma patients with different prognosis.• We developed a nucleotide metabolism-related signature to divide patients into high-risk or low-risk group.• There were significant differences in clinical and pathological features, genomic alterations and prognosis between the two groups.• Our signature was an independent prognostic indicator of gliomas, and we constructed a nomogram model for individualized survival prediction.• Nucleotide metabolism was not only related to cell division and cell cycle, but also affected the immune microenvironment in gliomas.


Introduction
Gliomas are the most common malignant primary brain tumor in adults and arise from glial or precursor cells [1].In adults, the disease mainly consists of diffuse tumors ranging from grade 2 to 4. Currently, the treatment of gliomas requires a multidisciplinary approach, including surgical resection, radiotherapy, systemic treatment and supportive care [2].Unfortunately, despite aggressive treatment, most patients with glioma have an extremely poor prognosis.In the most common type, glioblastoma (GBM, grade 4), only 5.5 % of patients survive more than five years [3].In order to address the limitations of current treatment strategies, we need to further explore the pathogenesis of gliomas and develop new therapies to improve the prognosis of patients.
Nucleotides are necessary in various biological processes, both for DNA replication and RNA synthesis.Meanwhile, many metabolic central molecules are also nucleotides or contain nucleotide fragments, such as ATP, NADH, CoA, and so on [4].In cancer, mutations and genomic aberrations that promote biosynthesis, together with changes in metabolic enzyme expression, lead to elevated levels of nucleotide synthesis [5].Increased nucleotide metabolism is a hallmark of cancer and nucleotides are mainly used as raw materials for nucleic acid synthesis to support the uncontrolled proliferation of tumor cells [6].In addition to their proliferative role in tumors, purine molecules have been well recognized as purinergic signaling ligands that mediate proliferation, differentiation and apoptosis of tumor cells [7].And pyrimidine metabolism can also maintain epithelial-to-mesenchymal transition phenotype in tumors and influence metastasis [8].Inhibition of nucleotide metabolic pathways has long been considered as a therapeutic strategy for cancer.Purine and pyrimidine analogs are used in cancer treatment, and they are incorporated into DNA and inhibit the activity of DNA polymerase [9].Some classical drugs, such as 6-MP, methotrexate and 5-fluorouracil, are used for cancer treatment by inhibiting enzymes involved in nucleotide metabolism [10,11].Exploring and clarifying the complex mechanism of nucleotide metabolism in tumors will facilitate the development of nucleotide metabolism-specific drugs and improve the survival and prognosis of tumors, including gliomas.
In this study, we comprehensively analyzed the nucleotide metabolism-related gene set and found that they could classify the glioma patients with different prognosis.Then we developed a nucleotide metabolism-related signature to divide patients into high-risk or lowrisk group.There were significant differences in clinical and pathological features, genomic alterations and prognosis between the two groups.Our signature was an independent prognostic indicator of gliomas, and we constructed a nomogram model for individualized survival prediction.Functional analysis showed that nucleotide metabolism was not only related to cell division and cell cycle, but also affected the immune microenvironment in gliomas.

Consensus clustering
Two nucleotide metabolism-related gene sets (REAC-TOME_METABOLISM_OF_NUCLEOTIDES and MODULE_337) were obtained from the Molecular Signatures Database v7.4 (MSigDB, https:// www.gsea-msigdb.org/gsea/msigdb/).After the elimination of duplicate genes, 155 nucleotide metabolism-related genes were used for the subsequent analysis.We identified the most variable genes by the median absolute deviation (MAD > 0.5) and consensus clustering analysis was performed with R package "ConsensusClusterPlus".

Gene signature identification
In the TCGA training set, univariate Cox regression analysis was performed to evaluate the prognostic value of nucleotide metabolismrelated genes, and 115 genes related to survival (p < 0.05 by Wald test) were selected.With the selected genes, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm [18] generated a Cox model including 17 genes, which had the minimum average cross-validation error.Our signature risk score was developed from a linear combination of expression levels (expr) of 17 genes, which were weighted by their LASSO regression coefficients: Risk Score = (expr gene1 x coefficient gene1 ) + (expr gene2 x coefficient gene2 ) + … + (expr gene17 x coefficient gene17 ).The same formula and regression coefficients were used to calculate the risk score of three validation sets.

Bioinformatic analysis
Univariate and multivariate Cox regression analysis were performed to evaluate the prognostic value of variables including our signature.Variance inflation factor (VIF) was used for collinearity diagnosis, and variables without collinearity (VIF < 5) were included in the regression analysis.Time-dependent ROC curve (timeROC) was applied to predict one-, three-and five-year overall survival with R package "timeROC" [19,20].The individualized prediction model nomogram integrated signature and clinical indicators with R package "rms" [21].Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed with genes that were significantly positively correlated with signature (Pearson R > 0.6, p < 0.05) in DAVID Bioinformatics Resources 6.8 (https://david.ncifcrf.gov/).Gene set enrichment analysis (GSEA) was performed with R package "fgsea".ESTIMATE algorithm was performed to calculate immune and stromal scores for gliomas [22].CIBERSORT algorithm was performed to evaluate the proportion of 22 immune cell types in gliomas [23,24].

Single-cell RNA sequencing
We collected two samples of primary IDH-wildtype glioblastoma patients from Beijing Tiantan Hospital, Capital Medical University.Fresh glioma samples collected during surgery were immediately transported to the laboratory and processed with enzyme digestion and red blood cell lysis to produce single-cell suspensions.According to the manufacture's introduction, the cDNA libraries were constructed with Chromium Single Cell 3′ Library and Gel Bead kit v2 (120267, 10x Genomics).The libraries were sequenced on Illumina HiSeq platform and the raw data was preprocessed with Cell Ranger pipeline (v3.0.2, 10X Genomics), mapped to the hg19 reference genome.Cells in gene-cell matrices with fewer than 200 transcripts and genes with fewer than two counts in two cells were filtered and removed.The matrix was then normalized such that the number of Unique Molecular identifiers (UMIs) in each cell was equal to the median UMI count across the dataset and log-transformed.Gene-barcode matrices were analyzed with the R package "Seurat" (version 5) [25].Based on QC metrics in the standard pre-processing workflow for scRNA-seq data, we filtered cells that had unique feature counts over 4000 or less than 200, UMI counts over 30, 000 and 20 % mitochondrial counts.To reduce the gene expression matrix to its most important features, we used principal component analysis (PCA) to decrease the dimensionality of the dataset.To visualize data in 2-D space, we passed the PCA-reduced data into UMAP (uniform manifold approximation and projection), a non-linear dimensional reduction method.

Statistical analysis
R language (https://www.r-project.org/) was the main statistical and graphic environment in our study.Differences in variables between groups were assessed by Student's t-test or Chi-square test.The Kaplan-Meier survival curve was evaluated by log-rank test.Other R packages for drawing included pheatmap, gglpot2, pROC, ComplexHeatmap, and circlize.p value less than 0.05 was statistically significant.

Identification of a nucleotide metabolism-related signature in gliomas
We first explored the relationship between nucleotide metabolism status and glioma prognosis.Consensus clustering was performed using nucleotide metabolism-related gene sets, and TCGA patients could be divided into two robust clusters (Fig. 1A-C).The heatmap showed that the expression of nucleotide metabolism-related genes was significantly different between the two clusters (Fig. 1D).Meanwhile, the survival curve showed that the overall survival of patients in cluster 1 was significantly shorter than that in cluster 2 (p < 0.05, Fig. 1E).These results indicated that nucleotide metabolism status was closely related to the prognosis of patients with glioma.
To identify a nucleotide metabolism-related signature, 115 genes associated with overall survival were screened by univariate Cox regression analysis in TCGA dataset.By LASSO algorithm, the 17 most valuable predictive genes were selected as active covariates (Fig. 2A).Our nucleotide metabolism-related signature (risk score) was identified with a linear combination of expression of 17 genes weighted by their regression coefficients (Fig. 2B).Based on the median risk score as the cutoff value, we divided TCGA patients into high-risk and low-risk groups and compared clinical and pathological differences between the two groups (Fig. 2C).We found that the high-risk group was strongly linked with older age, GBM (glioblastoma, WHO grade 4), IDH-wildtype, 1p/19q non-codeletion, MGMT promoter unmethylation and classical or mesenchymal subtypes (p < 0.001).These findings were validated in three other three datasets (Fig. 2D and Figure S1).

Association between nucleotide metabolism-related signature and pathological features in gliomas
Since gliomas contained different pathological types, we then analyzed the differences in risk score.We found that risk score increased significantly with the increase of WHO grade from 2 to 4 (p < 0.05, Fig. 3A).Risk score was significantly higher in patients with IDHwildtype, 1p/19q non-codeletion, MGMT promoter unmethylation, or TERT promoter mutation (p < 0.05, Fig. 3B-D, Figure S2C).Similar differences in risk score were observed in the other two validation sets (Figure S2A and B).Among TCGA molecular subtypes, mesenchymal subtype, which was associated with poor prognosis, had the highest risk score (Fig. 3E and Figure S2D).This suggested that risk score might be a biomarker for mesenchymal subtype.We further evaluated signature's ability to predict the mesenchymal subtype with Receiver Operating Characteristic (ROC) curve and the area under curve (AUC) was 0.905, 0.903, 0.822 and 0.871 in four datasets, respectively (Fig. 3F and Figure S2E).

Association between nucleotide metabolism-related signature and genomic alterations
To reveal the molecular mechanism underlying nucleotide metabolism-related signature, we collected somatic mutation and copynumber alterations (CNA) data from TCGA dataset.In the low-risk group, we observed significant enrichment of IDH1, ATRX, CIC, FUBP1 and NOTCH1 mutations (p < 0.05, Fig. 4A).Meanwhile, high frequency of mutations in EGFR, PTEN, NF1, PDGFRA and RB1 were observed in the high-risk group (p < 0.05, Fig. 4B).In addition to somatic mutations, we also analyzed copy number alterations.The high-risk group showed more frequently deleted regions such as CDKN2A, CDKN2B, MLLT3, PTEN, and more amplified regions such as EGFR, CDK4, PDGFRA, MDM2 (p < 0.05, Fig. 4).These findings suggested higher genomic alterations in the high-risk group.

Prognostic analysis of nucleotide metabolism-related signature
To evaluate the prognostic value of nucleotide metabolism-related signature, we performed the Kaplan-Meier survival curve analysis.As shown in Fig. 5, patients in the high-risk group had significantly shorter overall survival than those in the low-risk group of four datasets (p < 0.001 by log-rank test).Due to the heterogeneity of different grades of gliomas, we analyzed survival curves in grade 2, grade 3, and grade 4 gliomas, and observed the same prognostic difference (p < 0.05 by logrank test).In addition, univariate and multivariate Cox regression analysis showed that the signature risk score (HR, 2.589; 95 % CI, 1.649-4.063;p = 3.54E-05) was an independent prognostic indicator of glioma in TCGA dataset (Table 1).Similar results were shown in the other three validation sets (p < 0.05).

Construction of an individualized prediction model based on the signature
To assess the predictive value of the signature for overall survival, we performed the timeROC analysis.The 1-year, 3-year and 5-year AUC of our signature risk score were 88.24 %, 92.79 %, and 87.86 %, superior to age (84.02 %, 83.85 %, 81.07 %) and grade (80.67 %, 85.43 %, 84.88 %) in TCGA dataset (Fig. 6A).We found the similar results in the other three validation sets (Fig. 6B and Figure S3).Next, we constructed a nomogram model to evaluate individual prognosis by integrating independent prognostic indicators (age and risk score) of TCGA dataset (Fig. 6C).The concordance index (C-index) of our nomogram was 0.867 and the calibration curve showed a satisfactory agreement between actual observations and model predictions (Fig. 6D).

Functional annotation of nucleotide metabolism-related signature
In order to explore the potential function of nucleotide metabolismrelated signature, we first performed Pearson correlation analysis to determine genes that were positively correlated with signature (R > 0.6, p < 0.05).GO analysis showed that these genes were mainly involved in biological processes including cell division, mitotic nuclear division, DNA replication, chromosome segregation and immune response (such as leukocyte migration, antigen processing and presentation, inflammatory response) (Fig. 7A).And KEGG pathway analysis showed that these genes were enriched in pathways including cell cycle, p53 signaling pathway, DNA replication and immune response (such as leukocyte transendothelial migration, TNF signaling pathway, NF-kappa B signaling pathway) (Fig. 7B).Meanwhile, GSEA analysis showed that similar functions were enriched in the high-risk group (Fig. 7C).These results suggested that nucleotide metabolism was not only related to cell division and cell cycle, but also affected immune response in gliomas.

Association between nucleotide metabolism-related signature and glioma immunity
To further understand the relationship between nucleotide metabolism and glioma immunity, we first calculated the immune score and stroma score for each glioma sample by ESTIMATE algorithm.We found that the immune score and stromal score of the high-risk group were significantly higher than those of the low-risk group (p < 0.05, Fig. 7D).Then, we performed CIBERSORT algorithm to quantitatively analyze the proportion of each immune cell subtype.We observed that resting or suppressive immune cells, including M0 macrophages and regulatory T cells, were significantly enriched in the high-risk group (p < 0.05, Fig. 7E).While activated immune cells, including activated mast cells, monocytes, and plasma cells, were significantly enriched in the low-risk group (p < 0.05).In addition, we selected representative immune checkpoint genes and signature risk score for Pearson correlation analysis.Chord diagrams showed positive correlations between risk score and the expression of immune checkpoint genes, suggesting that gliomas in the high-risk group might be immunosuppressive (Fig. 7F).
To better reveal the differences in the microenvironments of the two groups, single cell RNA sequencing (scRNA-seq) was performed on two IDH-wildtype glioblastomas.One sample had a high risk score (RS-high) and the other a low risk score (RS-low).Based on canonical markers for each cell type (Figure S4C), we identified four cell types, including tumor cells, macrophages, lymphocytes, and oligodendrocytes (Figure S4A and B).In RS-high samples, we confirmed a significantly higher proportion of tumor-infiltrating immune cells (including macrophages and lymphocytes, Figure S4D).And we preliminarily analyzed the functional phenotype of macrophages and found that the abundant macrophages in RS-high samples were mainly M2-like (anti-inflammatory, Figure S4E), while the macrophages in low-risk samples were mainly M1-like (pro-inflammatory, Figure S4F).This further confirmed the immunosuppressive status of tumors in the high-risk group.

Discussion
Metabolic changes in tumors have been observed for nearly a century, but in recent years the study of metabolism in cancer has gained renewed interest.Metabolic reprogramming is ubiquitous in cancer, giving tumor cells the ability to survive and proliferate under stress conditions [6].Common metabolic reprogramming in cancer includes glucose metabolism, lipid metabolism, amino acid metabolism, nucleotide metabolism and other bioenergy metabolism pathways [26].In proliferating tumor cells, increased nucleotide synthesis is required for DNA replication and RNA production to support protein synthesis during the cell cycle.Nucleotide metabolism and factors affecting this process have been confirmed to be involved in the development and progression of cancer [4].Minakshi et al. found that the lncRNA lincNMR affected tumor cell proliferation by regulating nucleotide metabolism [27].Madhusudhan et al. reported that mutated p53 drives and maintains cancer development by controlling nucleotide synthesis [28].In glioblastoma stem cells, Wang et al. revealed that high levels of pyrimidine metabolism were necessary to maintain the stem-like phenotype that promoted self-renewal and tumorigenesis [29].Norihiro et al. found that driving the synthesis of pyrimidine nucleotide enhanced the liver metastasis of colorectal cancer [30].In this study, we detected the nucleotide metabolic status of gliomas, and for the first time developed a nucleotide metabolism-related signature through comprehensive analysis of transcriptome data.We divided gliomas into high-risk or low-risk groups based on median signature, and found significant enrichment of older age, WHO grade 4, IDH-wildtype, 1p/19p intact, MGMT promoter unmethylation and mesenchymal subtype in the high-risk group.And ROC curve showed that our signature was a potential biomarker for mesenchymal subtype.Using the survival curve and univariate and multivariate Cox regression analysis, we found that signature was an independent prognostic factor of gliomas.This suggested that nucleotide metabolism might influence the prognosis of glioma patients.Combining signature and age, we further constructed a nomogram model to predict individual survival.
In addition to clinical and pathological differences, we performed an integrated analysis of genomic alterations.We found that mutations of EGFR, PTEN, NF1, PDGFRA and RB1 were more common in the high-risk group, while mutations of IDH1, ATRX, CIC, FUBP1 and NOTCH1 were more common in the low-risk group.In addition to gene mutation differences, we also found that copy number variation was more frequent in the high-risk group.These results revealed the relationship between genomic alterations and nucleotide metabolism in gliomas, suggesting that genomic alterations might be the root cause of nucleotide metabolic reprogramming in gliomas.
Functional analysis showed that signature was not only associated with cell division and cell cycle but also with immune responses, suggesting an interaction between nucleotide metabolism and immune environment.Recent studies have shown that abnormal nucleotide metabolism not only supports tumor cell growth but also regulates immune responses in tumor microenvironment.Purine nucleotides affect the cytotoxicity of immune cells by regulating the expression of immune ligand MICA [31].Adenosine enrichment in tumors decreased the infiltration of protective immune cells, including T cells and NK cells, while enhancing the role of immunosuppressive subsets, including regulatory T cells and MDSCs [32].Inhibition of purine nucleotide synthesis in tumor can increase the expression of immunoproteasome and enhance the response of CD8 T cells to anti-PD-1 immunotherapy [33].Regulation of nucleotide metabolism may potentially improve the response to tumor immunotherapy and immunosuppressive state in tumor, consequently inducing tumor cell death [34].To further understand the relationship between nucleotide metabolism and glioma immunity, we first performed ESTIMATE algorithm and found that the immune score and stromal score of the high-risk group were significantly higher than those of the low-risk group.And CIBERSORT analysis revealed that immunosuppressive subsets, including M0 macrophages and regulatory T cells, were enriched in the high-risk group.Pearson correlation analysis showed that the expression of immune checkpoint

Fig. 1 .
Fig. 1.Consensus clustering of gliomas with nucleotide metabolism-related genes.(A) Consensus clustering cumulative distribution function (CDF) for k = 2 to k = 10.(B) Relative change in area under CDF curve for k = 2 to k = 10.(C) Consensus clustering matrix of 702 glioma samples from TCGA dataset for k = 2. (D) Heatmap of the two clusters defined by the nucleotide metabolism-related genes (MAD > 0.5).(E) Survival curve analysis of glioma patients in two clusters.

Fig. 2 .
Fig. 2. Identification of a nucleotide metabolism-related signature in gliomas.(A) Partial likelihood deviance plot of 115 nucleotide metabolism-related genes in LASSO regression analysis.(B) 17 genes selected by LASSO and their corresponding regression coefficients.(C-D) Heatmaps show the clinical and pathological features of the high-risk and low-risk groups in TCGA and CGGA datasets.***P < 0.001.IDH, isocitrate dehydrogenase; MGMTp, methylguanine methyltransferase promoter.

Fig. 3 .
Fig. 3. Association between the nucleotide metabolism-related signature and pathological features in gliomas.(A-E) The distribution of signature risk score in patients stratified by WHO grade, IDH mutation status, 1p/19q codeletion status, MGMT promoter methylation, and TCGA molecular subtypes.(F) Receiver operating characteristic (ROC) curves show the predictive value of risk score for mesenchymal subtype.

Fig. 4 .
Fig. 4. Comparison of genomic alterations between the high-risk group and low-risk group.The upper part of the oncoplots show the somatic mutations, and the lower part show copy number alterations.Chi-square test, *P < 0.05.

Fig. 6 .
Fig. 6.A nomogram model for predicting the survival of glioma patients.(A-B) The timeROC curves to predict 1-year, 3-year, and 5-year overall survival of glioma patients based on risk score (RS), age, and grade in TCGA and CGGA dataset.(C) A nomogram model was constructed by integrating the signature RS and age in TCGA dataset.(D) Calibration curve of the nomogram model for predicting 1-year (red line), 3-year (blue line) and 5-year (black line) overall survival.

Fig. 7 .
Fig. 7. Functional and immune analysis of the nucleotide metabolism-related signature in gliomas.(A) GO analysis for genes positively correlated with signature.(B) KEGG pathway analysis for genes positively correlated with signature.(C) GSEA analysis revealed biological functions and pathways that were enriched in the highrisk group.(D) ESTIMATE analysis shows differences in immune scores and stromal scores between the high-risk and low-risk groups.(E) CIBERSORT analysis shows differences in the proportion of different immune cell types between the high-risk and low-risk groups.(F) Pearson correlation analysis of immune checkpoint gene expression and risk score.**P < 0.01, ***P < 0.001, ****P < 0.0001.

Table 1
Variables related to OS in gliomas: univariate and multivariate analysis.